Evaluating Term Extraction Methods for Interpreters

Authors

  • Ran Xu
  • Serge Sharoff
Abstract

This study investigates term extraction methods that use comparable corpora to support interpreters. Simultaneous interpreting requires efficient use of highly specialised, domain-specific terminology in the interpreter's working languages, yet interpreters have limited time to prepare for new topics. We evaluate several terminology extraction methods for Chinese and English in settings that replicate real-life scenarios with respect to task difficulty, the range of terms, and the amount of material available. We also investigate interpreters' perceptions of the usefulness of automatic termlists. The results show that the terminology extraction pipelines are not perfectly accurate: their precision ranges from 27% on short texts to 83% on longer corpora for English, and from 24% to 31% for Chinese. Nevertheless, the use of even small corpora for specialised topics greatly facilitates interpreters' preparation.
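The precision figures quoted in the abstract can be read as the share of automatically extracted terms that are judged to be genuine terms. The following is a minimal sketch of that calculation, assuming a simple exact-match comparison against a gold-standard list; the function name `precision`, the matching rule, and the sample data are illustrative assumptions and are not taken from the study.

```python
def precision(extracted_terms, gold_terms):
    """Share of extracted terms that also appear in the gold-standard list.

    Assumption: terms match if they are identical after trimming
    whitespace and lowercasing. The study may have used human judges
    or a different matching rule.
    """
    extracted = {t.strip().lower() for t in extracted_terms if t.strip()}
    gold = {t.strip().lower() for t in gold_terms if t.strip()}
    if not extracted:
        return 0.0  # no extracted terms, so nothing can be correct
    return len(extracted & gold) / len(extracted)


# Hypothetical example data, not taken from the study.
extracted = ["fast breeder reactor", "nuclear fuel", "the results", "plutonium"]
gold = ["fast breeder reactor", "nuclear fuel", "plutonium", "coolant"]

print(f"precision = {precision(extracted, gold):.0%}")
```

Here three of the four extracted candidates appear in the gold list, so the sketch reports a precision of 75%. Recall is not computed because the abstract reports precision only.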


Similar resources

Specialized Corpora from the Web and Term Extraction for Simultaneous Interpreters

There is no doubt that the Web is a mine of language data of unprecedented richness and ease of access (Kilgarriff and Grefenstette 2003). As more people use the Web for more tasks, it provides an increasingly representative machine-readable sample of interests and activity in the world (Henzinger and Lawrence 2004). Despite some drawbacks, the Web is an immense source of disposable corpora (Va...


Database for Evaluating Extracted Terms and Tool for Visualizing the Terms

We constructed a database that can be used to evaluate term extraction. Because it is very exhaustive, it can be used to calculate precision and recall rates. We ran experiments on term extraction and compared various kinds of methods. We also applied automatic tools that extract terms and display the results as a two-dimensional figure in about 20 seconds.


Optimization of Colchicine Extraction from Colchicum Kurdicum (Bornm.) Stef. Corm and Evaluating Anti-Inflammatory and Anti-Oxidant Activities of the Plant Extract

Background and purpose: Colchicum kurdicum (Bornm.) Stef. is a monocotyledon plant which is endemic to Iran. The corm and seeds of this plant have some bioactive compounds, especially tropolone alkaloids that are used in treatment of inflammations, rheumatoid arthritis, gout, joint pains, and cancers. This study aimed at optimization of colchicine extraction from the corms of C. kurdicum and ev...


Specialized Corpora from the Web and Terms Extraction for Simultaneous Interpreters

This paper presents the results of an experiment conducted using BootCaT, a toolkit to bootstrap specialized corpora and terms from the web. In order to evaluate the differences and similarities between automatically and manually constructed corpora, we compare the results of a terminological extraction from two corpora, the first retrieved by BootCaT and the second constructed by a professional ...


Comparative Information Extraction from Sar and Optical Imagery

Presently EuroSDR, controlled by the Technical University of Berlin, is conducting a test on competitive information extraction from state of the art airborne multi-polarised SAR imagery (C, X and L – band) and high resolution optical imagery of the same area. The test envisages 3 stages, namely visual interpretation and map compilation, automatic object extraction and sensor fusion. Some first...




Publication date: 2014